Abstract

In this paper, we propose a forecasting model for volatility based on its decomposition into
several investment horizons and jumps. As a forecasting tool, we use the Realized GARCH framework,
which jointly models returns and realized measures of volatility. Using the jump wavelet two-scale
realized volatility estimator (JWTSRV), we first decompose the volatility of returns into several
investment horizons and jumps, and then utilise this decomposition in the newly proposed Realized
Jump-GARCH and Realized Wavelet-Jump GARCH models. On currency futures data covering
the period of the recent financial crisis, we moreover compare the forecasts from the Realized GARCH
model using several additional realized volatility measures, namely the realized volatility,
bipower variation, two-scale realized volatility, realized kernel and jump wavelet two-scale realized
volatility. We find that the in-sample as well as the out-of-sample performance of the model differs
significantly depending on the realized measure used. When the JWTSRV estimator is used, the model
produces significantly better forecasts than with the other measures. Our Realized Wavelet-Jump
GARCH model further improves the volatility forecasts. We conclude that measuring realized
volatility in the time-frequency domain and including jumps improves volatility forecasting considerably.

1. Introduction

Much of the recent popularity of realized volatility is due to its two distinct implications
for practical estimation and forecasting. The first relates to the measurement of realizations of the
latent volatility process without the need for any assumptions about the explicit model. The second
brings the possibility of forecasting volatility directly through standard time series econometrics
with discretely sampled daily data, while effectively extracting information from intraday
high-frequency data.

The most fundamental result in realized variation states that it provides a consistent nonparametric
estimate of price variability over a given time interval. The formalized theory is presented
by Andersen et al. (2003). While these authors provide a unified framework for modeling, Zhou
(1996) was one of the first to provide a formal assessment of the relationship between cumulative
squared intraday returns and the underlying return variance. The pioneering work by Olsen &
Associates on the use of high-frequency data, summarized by Dacorogna et al. (2001), produced
milestone results for many of the more recent empirical developments in realized variation. A
vast literature on several aspects of estimating volatility has emerged in the wake of
these fundamental contributions. Our work builds on this popular realized volatility approach.
While most time series models are set in the time domain, we enrich the analysis with the frequency
domain. This is enabled by the use of the wavelet transform. It is a logical step to take, as the
stock markets are believed to be driven by heterogeneous investment horizons. In our work, we
ask whether wavelet decomposition can improve our understanding of volatility series and hence
improve volatility forecasting.

One very appealing feature of wavelets is that they can be embedded into stochastic processes,
as shown by Antoniou and Gustafson (1999). Thus we can conveniently use them to extend the
theory of realized measures, as shown by Barunik and Vacha (2012). One of the common issues
with the interpretation of wavelets in economic applications is that they are filters, and thus can
hardly be used for forecasting in econometrics. Our wavelet-based estimator of realized volatility
uses wavelets only to decompose the daily variation of the returns using intraday information,
hence this is no longer an issue. As the wavelets are used to measure realized volatility at different
investment horizons, our approach can be used to construct a forecasting model based on the
wavelet decomposition.

Several attempts to use wavelets in the estimation of realized variation have emerged in the
past few years. Høg and Lunde (2003) were the first to suggest a wavelet estimator of realized
variance. Capobianco (2004), for example, proposes to use a wavelet transform as a comparable
estimator of quadratic variation. Subbotin (2008) uses wavelets to decompose volatility into a
multi-horizon scale. Next, Nielsen and Frederiksen (2008) compare the finite sample properties of
three integrated variance estimators, i.e., realized variance, Fourier and wavelet estimators. They
consider several processes generating time series with long memory, jump processes as well as
bid-ask bounce. Gençay et al. (2010) mention the possible use of wavelet multiresolution analysis
to decompose realized variance, while they concentrate on developing much more
complicated structures of variance modeling in different regimes through wavelet-domain hidden
Markov models. To complete the literature, Mancino and Sanfelici (2008) and Olhede et al. (2009)
propose estimators based on the Fourier transform. While the idea is very similar, this approach
leads to realized volatility measurement solely in the frequency domain.

One exception which fully completes the current literature on using wavelets in realized variation
theory is the work of Fan and Wang (2007), who were the first to use the wavelet-based realized
variance estimator and also the methodology for the estimation of jumps from the data. In Barunik
and Vacha (2012), we revisit and extend this work in several ways. Instead of using the Discrete
Wavelet Transform we use the Maximum Overlap Discrete Wavelet Transform, which is a more
efficient estimator and is not restricted to sample sizes that are powers of two. We also use the
Daubechies D(4) wavelet filter instead of the Haar type. Moreover, we provide a large finite sample
study confirming the behaviour of the wavelet estimators and run a forecasting simulation in which
our estimator is shown to improve the forecasting of the integrated variance substantially when
compared to other estimators. Finally, in Barunik and Vacha (2012) we attempt to use the estimators to
decompose stock market volatility into several investment horizons in a non-parametric way.

Motivated by these results, this paper focuses on proposing a model which will improve the
forecasting of volatility. Similarly to Lanne (2007) and Andersen et al. (2011), we use the
decomposition of the quadratic variation with the intention of building a more accurate forecasting
model. Our approach is very different though, as we use wavelets to decompose the integrated
volatility into several investment horizons and jumps. Moreover, we employ the recently proposed
Realized GARCH framework of Hansen et al. (2011). Realized GARCH allows returns and realized
measures of volatility to be modeled jointly; its key feature is a measurement equation that relates
the realized measure to the conditional variance of returns. Within the Realized GARCH framework,
we use several measures of realized volatility, namely the realized volatility estimator proposed by
Andersen et al. (2003), the bipower variation estimator of Barndorff-Nielsen and Shephard (2004),
the two-scale realized volatility of Zhang et al. (2005), the realized kernel of Barndorff-Nielsen
et al. (2008) and finally the jump wavelet two-scale realized variance (JWTSRV) estimator of Barunik
and Vacha (2012). We find significant differences in the volatility forecasts, with our JWTSRV
estimator bringing the largest improvement.

The main contributions of the paper are two new specifications of the Realized GARCH model
based on the volatility decomposition and jumps. First, we utilize jumps estimated by the JWTSRV
estimator to build a Realized Jump-GARCH(1,1) model. Second, we add realized volatility
measured at several investment horizons and build Realized Wavelet Jump-GARCH(1,1) models,
expecting that our models will result in better in-sample fits of the data as well as in better
out-of-sample forecasts. We are motivated by the statistical properties of the decomposed volatility
series, which suggest that each scale might carry somewhat different information. The empirical
analysis shows that our newly proposed models bring significant improvement in volatility forecasts.

The paper is organized as follows. After the introduction, Section 2 reviews
all the realized measures used in the forecasting exercise, Section 3 introduces our estimation
of the realized variance and jumps using wavelets, and Section 4 proposes the Realized
Jump-GARCH(1,1) and Realized Wavelet Jump-GARCH(1,1) models. Section 5 applies the
presented theory, decomposes the empirical volatility of forex futures and finally uses the
decomposition for forecasting.

2. Realized variance 

We assume that the latent logarithmic asset price follows a standard jump-diffusion process and
is contaminated with microstructure noise. Let $y_t$ be the observed logarithmic prices at
$0 \le t \le T$, which will be equal to the latent, so-called "true log-price process",
$dp_t = \mu_t\,dt + \sigma_t\,dW_t + \xi_t\,dq_t$, and will contain microstructure noise,

$$y_t = p_t + \epsilon_t, \qquad (1)$$

where $\epsilon_t$ is zero mean i.i.d. noise with variance $\eta^2$, $q_t$ is a Poisson process
uncorrelated with $W_t$ and governed by the constant jump intensity $\lambda$. The magnitude of the
jump in the return process is controlled by the factor $\xi_t \sim N(\bar{\xi}, \sigma_\xi^2)$.

Quadratic return variation over the $[t-h, t]$ time interval for $0 \le h \le t \le T$, associated
with $p_t$,

$$QV_{t,h} = \underbrace{\int_{t-h}^{t} \sigma_s^2\,ds}_{IV_{t,h}} + \underbrace{\sum_{t-h \le l \le t} J_l^2}_{JV_{t,h}}, \qquad (2)$$

can be naturally decomposed into two parts: the integrated variance of the latent price process,
$IV_{t,h}$, and the jump variation, $JV_{t,h}$. As detailed by Andersen et al. (2003), quadratic
variation is a natural measure of variability in the logarithmic price.

A simple consistent estimator of the overall quadratic variation under the assumption of zero
noise contamination in the price process is provided by the well-known realized variance, introduced
by Andersen and Bollerslev (1998). The realized variance over $[t-h, t]$, for $0 \le h \le t \le T$,
is defined by

$$\widehat{RV}_{t,h} = \sum_{i=1}^{N} r^2_{t-h+\left(\frac{i}{N}\right)h}, \qquad (3)$$

where $N$ is the number of observations in $[t-h, t]$ and $r_{t-h+\left(\frac{i}{N}\right)h}$ is the
$i$-th intraday return in the $[t-h, t]$ interval. $\widehat{RV}_{t,h}$ converges in probability to
$IV_{t,h} + JV_{t,h}$ as $N \to \infty$ (Andersen and Bollerslev, 1998; Andersen et al., 2001, 2003;
Barndorff-Nielsen and Shephard, 2001, 2002a,b). As the observed log-prices $y_t$ are contaminated
with noise in the real world and we are mainly interested in the $IV_{t,h}$ part of the quadratic
variation, subsequent literature has developed several estimators dealing
with both jumps and noise.
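
As a rough illustration of Eq. (3), the realized variance is simply the sum of squared intraday log-returns. The sketch below assumes log-prices observed on one intraday grid; the function name and the toy prices are hypothetical and not part of our data.

```python
import math


def realized_variance(log_prices):
    """Sum of squared intraday log-returns, in the spirit of Eq. (3)."""
    returns = [b - a for a, b in zip(log_prices, log_prices[1:])]
    return sum(r * r for r in returns)


# Toy example: five prices observed within one day (made-up numbers).
prices = [100.0, 100.5, 100.2, 100.8, 100.6]
rv = realized_variance([math.log(p) for p in prices])
```

Taking the square root of `rv` would give the corresponding realized volatility.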

2.1. Effect of microstructure noise

Zhang et al. (2005) propose a solution to the noise contamination by introducing the so-called
two-scale realized volatility (TSRV henceforth) estimator. The authors propose a methodology for
the measurement of realized variance that utilizes all of the available data, based on the idea of
precise bias estimation. The two-scale realized variation over $[t-h, t]$, for $0 \le h \le t \le T$,
is measured by

$$\widehat{RV}^{(TSRV)}_{t,h} = \underbrace{\widehat{RV}^{(average)}_{t,h}}_{\text{slow time scale}} - \frac{\bar{N}}{N}\,\underbrace{\widehat{RV}^{(all)}_{t,h}}_{\text{fast time scale}}, \qquad (4)$$

where $\widehat{RV}^{(all)}_{t,h}$ is computed using Eq. (3) on all available data and
$\widehat{RV}^{(average)}_{t,h}$ is constructed by averaging the estimators $\widehat{RV}^{(k)}_{t,h}$
obtained on $K$ grids of average size $\bar{N} = N/K$ as:

$$\widehat{RV}^{(average)}_{t,h} = \frac{1}{K}\sum_{k=1}^{K} \widehat{RV}^{(k)}_{t,h}. \qquad (5)$$

In computing the TSRV, we first have to partition the original grid of observation times,
$G = \{t_0, \ldots, t_N\}$, into subsamples $G^{(k)}$, $k = 1, \ldots, K$, where $N/K \to \infty$ as
$N \to \infty$. For example, $G^{(1)}$ will start at the first observation and take an observation
every 5 minutes, $G^{(2)}$ will start at the second observation and take an observation every
5 minutes, etc. Finally, we average these estimators through the subsamples, so we average the
variation of the estimator as well. $\widehat{RV}^{(TSRV)}_{t,h}$ provides the first consistent
estimator of the quadratic variation of $p_t$, with a rate of convergence of $N^{-1/6}$. Zhang et al.
(2005) also provide the theory for the optimal choice of $K$ grids, $K^* = cN^{2/3}$, where the
constant $c$ can be set to minimize the total asymptotic variance.
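
The two-scale construction can be sketched in a few lines. The code below is a minimal illustration under simplifying assumptions: subgrid $k$ starts at observation $k$ and takes every $K$-th point, and the average subgrid size is taken as $N/K$ as in the text. The function names are hypothetical.

```python
def realized_variance(log_prices):
    """Sum of squared log-returns on the given grid."""
    returns = [b - a for a, b in zip(log_prices, log_prices[1:])]
    return sum(r * r for r in returns)


def tsrv(log_prices, K):
    """Two-scale realized variance: averaged subgrid RV minus a bias correction."""
    n = len(log_prices) - 1                  # number of returns on the full grid
    rv_all = realized_variance(log_prices)   # fast time scale (all data)
    # Subgrid k starts at observation k and takes every K-th observation.
    rv_sub = [realized_variance(log_prices[k::K]) for k in range(K)]
    rv_avg = sum(rv_sub) / K                 # slow time scale (average)
    n_bar = n / K                            # average subgrid size (assumption: N/K)
    return rv_avg - (n_bar / n) * rv_all


# Toy grid of 13 "log-prices" with K = 3 subgrids (made-up numbers).
estimate = tsrv([float(i) for i in range(13)], 3)
```

The subtraction of the scaled all-data estimator is what removes the noise-induced bias that dominates when every observation is used.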

Another estimator that is able to deal with the noise, and which we use for comparison
in our study, is the realized kernel (RK) estimator introduced by Barndorff-Nielsen et al. (2008).
The realized kernel variance estimator is defined by

$$\widehat{RV}^{(RK)}_{t,h} = \gamma_{t,h,0} + \sum_{\eta=1}^{H} k\left(\frac{\eta-1}{H}\right)\left(\gamma_{t,h,\eta} + \gamma_{t,h,-\eta}\right), \qquad (6)$$

with $\gamma_{t,h,\eta} = \sum_{i=1}^{N} r_{t-h+\left(\frac{i}{N}\right)h}\, r_{t-h+\left(\frac{i-\eta}{N}\right)h}$
denoting the $\eta$-th realized autocovariance with $\eta = -H, \ldots, -1, 0, 1, \ldots, H$, and
$k(\cdot)$ denoting the kernel function. Please note that for $\eta = 0$,
$\gamma_{t,h,\eta} = \gamma_{t,h,0} = \widehat{RV}_{t,h}$ is the estimate of the realized variance
from Eq. (3). For the estimator to work, we need to choose the kernel function $k(\cdot)$. In our
study, we will focus on the Parzen kernel because it satisfies the smoothness conditions,
$k'(0) = k'(1) = 0$, and is guaranteed to produce a non-negative estimate. The Parzen kernel
function is given by

$$k(x) = \begin{cases} 1 - 6x^2 + 6x^3 & 0 \le x \le 1/2 \\ 2(1-x)^3 & 1/2 \le x \le 1 \\ 0 & x > 1. \end{cases} \qquad (7)$$

We should note that the realized kernel estimator is computed without accounting for end effects,
that is, without replacing the first and the last observation by local averages to eliminate the
corresponding noise components (so-called "jittering"). Barndorff-Nielsen et al. (2008) argue that
these effects are important theoretically, but are negligible practically.
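
To make the kernel weighting concrete, the following sketch evaluates the Parzen kernel of Eq. (7) and a kernel-weighted sum of realized autocovariances in the spirit of Eq. (6). It is an illustration only: autocovariances are truncated to within-day returns (so the terms for $\eta$ and $-\eta$ coincide), and the function names are hypothetical.

```python
def parzen(x):
    """Parzen kernel of Eq. (7) for x >= 0."""
    if x <= 0.5:
        return 1.0 - 6.0 * x ** 2 + 6.0 * x ** 3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0


def autocovariance(returns, eta):
    """Realized autocovariance at lag eta, using within-day returns only."""
    return sum(returns[i] * returns[i - eta] for i in range(eta, len(returns)))


def realized_kernel(returns, H):
    """Kernel-weighted sum of autocovariances up to lag H."""
    rk = autocovariance(returns, 0)
    for eta in range(1, H + 1):
        # With the truncation above, the lag -eta term equals the lag eta term.
        rk += parzen((eta - 1) / H) * 2.0 * autocovariance(returns, eta)
    return rk


rk_toy = realized_kernel([0.1, 0.0, 0.0], 1)
```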

2.2. Effect of jumps 

By introducing the TSRV and the RK estimators, we have benchmark estimators which are
able to consistently estimate the quadratic variation from noisy observations. Still, we are
interested in decomposing the quadratic variation into the integrated variance and jump variation
components. Barndorff-Nielsen and Shephard (2004, 2006) develop a powerful and complete way of
detecting the presence of jumps in high-frequency data. The basic idea is to compare two measures
of the integrated variance, one containing the jump variation and the other being robust to jumps
and hence containing only the integrated variation part. In our work, we use the Andersen et al.
(2011) adjustment of the original Barndorff-Nielsen and Shephard (2004) estimator, which helps
render it robust to certain types of microstructure noise. The bipower variation over $[t-h, t]$,
for $0 \le h \le t \le T$, is defined by

$$\widehat{RV}^{(BV)}_{t,h} = \mu_1^{-2}\,\frac{N}{N-2}\sum_{i=3}^{N} \left|r_{t-h+\left(\frac{i-2}{N}\right)h}\right| \cdot \left|r_{t-h+\left(\frac{i}{N}\right)h}\right|, \qquad (8)$$

where $\mu_a = E(|Z|^a)$ with $Z \sim N(0,1)$ and $a > 0$ (so that $\mu_1^{-2} = \pi/2$), and
$\widehat{RV}^{(BV)}_{t,h} \xrightarrow{p} \int_{t-h}^{t} \sigma_s^2\,ds$. Thus
$\widehat{RV}^{(BV)}_{t,h}$ provides a consistent estimator of the integrated variance and
$\widehat{RV}^{(sparse)}_{t,h}$ provides a consistent estimator of the quadratic variation. Then,
the jump variation can be estimated consistently as the difference between the realized variance
and the realized bipower variation:

$$\left(\widehat{RV}^{(sparse)}_{t,h} - \widehat{RV}^{(BV)}_{t,h}\right) \xrightarrow{p} \sum_{l=1}^{N_t} J_l^2. \qquad (9)$$

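Under the assumption of no jump and some other regularity conditions, Barndorff-Nielsen and
Shephard (2006) provide the joint asymptotic distribution of the jump variation. Under the null
hypothesis of no within-day jumps, the statistic

$$Z_{t,h} = \frac{\dfrac{\widehat{RV}^{(sparse)}_{t,h} - \widehat{RV}^{(BV)}_{t,h}}{\widehat{RV}^{(sparse)}_{t,h}}}{\sqrt{\left(\dfrac{\pi^2}{4} + \pi - 5\right)\dfrac{1}{N}\max\left(1, \dfrac{\widehat{TQ}_{t,h}}{\left(\widehat{RV}^{(BV)}_{t,h}\right)^2}\right)}}, \qquad (10)$$

where $\widehat{TQ}_{t,h} = N \mu_{4/3}^{-3}\,\frac{N}{N-4} \sum_{i=5}^{N} \left|r_{t-h+\left(\frac{i-4}{N}\right)h}\right|^{4/3} \left|r_{t-h+\left(\frac{i-2}{N}\right)h}\right|^{4/3} \left|r_{t-h+\left(\frac{i}{N}\right)h}\right|^{4/3}$,
is asymptotically standard normally distributed. Using this theory, the contribution of the jump
variation to the quadratic variation of the price process is measured by

$$J_{t,h} = I_{\{Z_{t,h} > \Phi_\alpha\}} \left(\widehat{RV}^{(sparse)}_{t,h} - \widehat{RV}^{(BV)}_{t,h}\right), \qquad (11)$$

where $I_{\{Z_{t,h} > \Phi_\alpha\}}$ denotes the indicator function and $\Phi_\alpha$ refers to the
chosen critical value from the standard normal distribution. The measure of integrated variance is
defined as

$$C_{t,h} = I_{\{Z_{t,h} \le \Phi_\alpha\}}\,\widehat{RV}^{(sparse)}_{t,h} + I_{\{Z_{t,h} > \Phi_\alpha\}}\,\widehat{RV}^{(BV)}_{t,h}, \qquad (12)$$

ensuring that the jump measure and the continuous part add up to the estimated realized variance,
$\widehat{RV}^{(sparse)}_{t,h}$.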
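
The benchmark jump test can be sketched as follows. This is a minimal illustration assuming the staggered (skip-one) bipower variation and tripower quarticity described above and the ratio form of the test statistic; the helper names and the toy return series are hypothetical.

```python
import math

MU_1 = math.sqrt(2.0 / math.pi)  # E|Z| for Z ~ N(0, 1)
MU_43 = 2.0 ** (2.0 / 3.0) * math.gamma(7.0 / 6.0) / math.gamma(0.5)  # E|Z|^(4/3)


def bipower_variation(r):
    """Staggered bipower variation: products of |returns| two steps apart."""
    n = len(r)
    s = sum(abs(r[i - 2]) * abs(r[i]) for i in range(2, n))
    return MU_1 ** -2 * n / (n - 2) * s


def tripower_quarticity(r):
    """Staggered tripower quarticity used to scale the test statistic."""
    n = len(r)
    s = sum((abs(r[i - 4]) * abs(r[i - 2]) * abs(r[i])) ** (4.0 / 3.0)
            for i in range(4, n))
    return n * MU_43 ** -3 * n / (n - 4) * s


def jump_z_statistic(r):
    """Ratio statistic; large positive values point to within-day jumps."""
    n = len(r)
    rv = sum(x * x for x in r)
    bv = bipower_variation(r)
    tq = tripower_quarticity(r)
    scale = (math.pi ** 2 / 4.0 + math.pi - 5.0) / n * max(1.0, tq / bv ** 2)
    return ((rv - bv) / rv) / math.sqrt(scale)


# Toy series: returns of constant magnitude, then the same series with one jump.
calm = [0.001 if i % 2 == 0 else -0.001 for i in range(100)]
jumpy = list(calm)
jumpy[50] = 0.05
z_calm = jump_z_statistic(calm)
z_jumpy = jump_z_statistic(jumpy)
```

Comparing `z_jumpy` with a standard normal critical value then decides whether the difference between the realized variance and the bipower variation is attributed to jumps, as in Eq. (11).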

We use the described jump detection methodology as the benchmark and we focus on wavelet 
methods for detecting jumps in the data, as described in the following section. 



3. Estimation of the realized variance using wavelets 

While most time series models are naturally set in the time domain, the wavelet transform helps us
to enrich the analysis of realized variance with the frequency domain. It is a logical step to take,
as the stock markets are believed to be driven by heterogeneous investment horizons, so volatility
dynamics should be understood not only in time but at investment horizons as well. We will
introduce the general ideas of constructing the estimators here, while we keep the details necessary
to understand the wavelet theory in Appendix A.

Following Barunik and Vacha (2012), the wavelet-based realized variance over $[t-h, t]$, for
$0 \le h \le t \le T$, is defined by

$$\widehat{RV}^{(WRV)}_{t,h} = \sum_{j=1}^{J^m+1}\sum_{k=1}^{N} W^2_{j,t-h+\frac{k}{N}h}, \qquad (13)$$

where $N$ is the number of intraday observations in $[t-h, t]$ and $J^m$ is the number of scales we
consider. $W_{j,t-h+\frac{k}{N}h}$ are the MODWT coefficients, unaffected by boundary conditions,
defined in Eq. (A.7) on returns data $r_{t,h}$ on components $j = 1, \ldots, J^m+1$, where
$J^m \le \log_2 N$. With increasing sampling frequency, $N \to \infty$, the wavelet-based realized
variance estimator is an unbiased and consistent estimator of the quadratic variation, and it is
the same as the realized variance estimator; for more details see Barunik and Vacha (2012). Still,
we assume that the observed $y_t$ process contains noise as well as jumps. Therefore, we need to
introduce concepts which will be able to deal with both.
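
The scale-by-scale decomposition behind Eq. (13) can be illustrated with a small MODWT sketch. For brevity it uses the Haar filter with circular boundary treatment, which is an assumption made here and differs from the boundary-free coefficients described above; the function names and toy returns are hypothetical.

```python
def haar_modwt(x, levels):
    """Haar MODWT via the pyramid algorithm with circular boundary treatment."""
    n = len(x)
    smooth = list(x)
    details = []
    for j in range(1, levels + 1):
        shift = 2 ** (j - 1)
        lagged = [smooth[(t - shift) % n] for t in range(n)]
        details.append([0.5 * (a - b) for a, b in zip(smooth, lagged)])
        smooth = [0.5 * (a + b) for a, b in zip(smooth, lagged)]
    return details, smooth


def wavelet_realized_variance(returns, levels):
    """Energy of the returns at each scale, plus the remaining smooth component."""
    details, smooth = haar_modwt(returns, levels)
    parts = [sum(c * c for c in w) for w in details]
    parts.append(sum(c * c for c in smooth))
    return parts


returns = [0.01, -0.02, 0.015, 0.0, -0.005, 0.02, -0.01, 0.005]
components = wavelet_realized_variance(returns, 2)
```

Because the MODWT preserves energy, the components sum to the ordinary realized variance of the same returns, which is the sense in which the wavelet-based estimator coincides with the realized variance estimator.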

3.1. Realized jump estimation using wavelets 

Wavelets can also be used for estimating jumps and separating integrated variance from jump
variation. We assume that the sample path of $p_t$ has a finite number of jumps (a.s.). Following
the theoretical results of Wang (1995) on the wavelet jump detection of deterministic functions
with i.i.d. additive noise $\epsilon_t$, we use the MODWT as the discretized version of the
continuous wavelet transform. Unlike the ordinary DWT, the MODWT is not restricted to a dyadic
sample length. For the estimation of jump location we use the universal threshold (Donoho and
Johnstone, 1994) on the first level wavelet coefficients of $y_t$ over $[t-h, t]$, $W_{1,k}$. If for
some $W_{1,k}$

$$|W_{1,k}| > d\sqrt{2\log N}, \qquad (14)$$

then $\hat{\tau}_l = \{k\}$ is the estimated jump location with size
$\bar{y}_{\hat{\tau}_l+} - \bar{y}_{\hat{\tau}_l-}$ (averages over
$[\hat{\tau}_l, \hat{\tau}_l + \delta_n]$ and $[\hat{\tau}_l - \delta_n, \hat{\tau}_l]$,
respectively, with $\delta_n > 0$ being the small neighborhood of the estimated jump
location$^1$ $\hat{\tau}_l \pm \delta_n$) and where $d$ is the median absolute deviation estimator
defined as $2^{1/2}\,\mathrm{median}\{|W_{1,k}|, k = 1, \ldots, N\}/0.6745$ (Percival and Walden, 2000).
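
The thresholding rule of Eq. (14) is simple to sketch once first-level wavelet coefficients are available. The function name and the synthetic coefficients below are hypothetical; the positional correction of the MODWT coefficients mentioned in the footnote is not included.

```python
import math
from statistics import median


def detect_jumps(w1):
    """Indices of first-level coefficients exceeding the universal threshold."""
    n = len(w1)
    # Median absolute deviation estimate of the noise scale.
    d = math.sqrt(2.0) * median(abs(c) for c in w1) / 0.6745
    threshold = d * math.sqrt(2.0 * math.log(n))
    return [k for k, c in enumerate(w1) if abs(c) > threshold]


# Synthetic coefficients: small oscillations plus one large value at k = 500.
w1 = [0.01 * math.sin(k) for k in range(1000)]
w1[500] = 1.0
jump_locations = detect_jumps(w1)
```

Because the scale estimate is based on the median, a handful of very large coefficients caused by jumps has little effect on the threshold itself.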

Using the result of Fan and Wang (2007), the jump variation is then estimated by the sum of
the squares of all the estimated jump sizes:

$$\widehat{JV}_{t,h} = \sum_{l=1}^{N_t} \left(\bar{y}_{t,h,\hat{\tau}_l+} - \bar{y}_{t,h,\hat{\tau}_l-}\right)^2, \qquad (15)$$

thus we are able to estimate the jump variation from the process consistently with the convergence
rate $N^{-1/4}$. In the following analysis, we will be able to separate the continuous part of the
price process containing noise from the jump variation. This result can be found in Fan and Wang
(2007), and it states that the jump-adjusted process
$y^{(J)}_{t,h} = y_{t,h} - \sum_{l=1}^{N_t} J_l$ converges in probability to the continuous part
without jumps. Thus, if we are able to deal with the noise in $y^{(J)}_{t,h}$, we will be able
to estimate $IV_{t,h}$.

3.2. Jump wavelet two scale realized variance (JWTSRV) estimator 

In the final estimator, we utilize the TSRV estimator of Zhang et al. (2005), the wavelet
realized variance estimator of Eq. (13) and the wavelet jump detection method. The final estimator
will moreover decompose the integrated variance into $J^m+1$ components, so we will be able to
study the dynamics of volatility at various investment horizons.

1 Due to the nature of the MODWT filters, we need to correct the position of the wavelet coefficient to get the
precise position of the jump. For more details see Percival and Mofjeld (1997).



Following Barunik and Vacha (2012), we define the jump-adjusted wavelet two-scale realized
variance (JWTSRV) estimator over $[t-h, t]$, for $0 \le h \le t \le T$, on the observed
jump-adjusted data, $y^{(J)}_{t,h} = y_{t,h} - \sum_{l=1}^{N_t} J_l$, as:

$$\widehat{RV}^{(JWTSRV)}_{t,h} = \sum_{j=1}^{J^m+1} \widehat{RV}^{(JWTSRV)}_{j,t,h} = \sum_{j=1}^{J^m+1} \left(\widehat{RV}^{(W,J)}_{j,t,h} - \frac{\bar{N}}{N}\,\widehat{RV}^{(WRV,J)}_{j,t,h}\right), \qquad (16)$$

where $\widehat{RV}^{(W,J)}_{j,t,h} = \frac{1}{G}\sum_{g=1}^{G}\sum_{k=1}^{N} W^2_{j,t-h+\frac{k}{N}h}$
is obtained from wavelet coefficient estimates on a grid of size $\bar{N} = N/G$ and
$\widehat{RV}^{(WRV,J)}_{j,t,h} = \sum_{k=1}^{N} W^2_{j,t-h+\frac{k}{N}h}$ is the wavelet realized
variance estimator at scale $j$ on the jump-adjusted observed data, $y^{(J)}_{t,h}$.

The JWTSRV estimator decomposes the realized variance into an arbitrarily chosen number of
investment horizons and jumps. Barunik and Vacha (2012) argue that it is a consistent estimator
of the integrated variance, as it converges in probability to the integrated variance of the process
$p_t$. Barunik and Vacha (2012) also test the small sample performance of the estimator in a large
Monte Carlo study and find that it is able to recover the true integrated variance from the noisy
process with jumps very precisely. They also run a forecasting simulation in which the JWTSRV
estimator is shown to improve the forecasting of the integrated variance substantially. In small
samples, a small sample refinement can be constructed (Zhang et al., 2005):

$$\widehat{RV}^{(JWTSRV,adj)}_{t,h} = \left(1 - \frac{\bar{N}}{N}\right)^{-1} \widehat{RV}^{(JWTSRV)}_{t,h}. \qquad (17)$$

When referring to the realized volatility estimated using our JWTSRV estimator, we will refer to
the square root of this estimate, $\sqrt{\widehat{RV}_{t,h}}$.



4. A forecasting model based on decomposed integrated volatilities 

Similarly to Lanne (2007) and Andersen et al. (2011), we use the decomposition of the quadratic
variation with the intention of building a more accurate forecasting model. Our approach is very
different though, as we first use wavelets to decompose the integrated volatility into several
investment horizons and jumps. Moreover, we employ the recently proposed Realized GARCH framework
of Hansen et al. (2011). Realized GARCH allows returns and realized measures of volatility to be
modeled jointly; its key feature is a measurement equation that relates the realized measure to the
conditional variance of returns. We expect that our modification will result in better in-sample fits
of the data as well as better out-of-sample forecasts.

4.1. Realized GARCH framework for forecasting

The key object of interest in the GARCH family is the conditional variance,
$h_t = var(r_t|\mathcal{F}_{t-1})$, where $r_t$ is a time series of returns. While in a standard
GARCH(1,1) model the conditional variance $h_t$ depends on its past value $h_{t-1}$ and on
$r^2_{t-1}$, Hansen et al. (2011) propose to utilize realized measures of volatility and make $h_t$
dependent on them as well. The authors propose a so-called measurement equation which ties the
realized measure to latent volatility. The general framework of Realized GARCH(p, q) models is well
connected to the existing literature in Hansen et al. (2011). Here, we restrict ourselves to the
simple log-linear specification of Realized GARCH(1,1) with Gaussian innovations, which we will use
to build our model. A simple log-linear Realized GARCH(1,1) model is given by

$$r_t = \sqrt{h_t}\,z_t, \qquad (18)$$
$$\log(h_t) = \omega + \beta\log(h_{t-1}) + \gamma\log(x_{t-1}), \qquad (19)$$
$$\log(x_t) = \xi + \psi\log(h_t) + \tau_1 z_t + \tau_2 z_t^2 + u_t, \qquad (20)$$

where $r_t$ is the return, $x_t$ a realized measure of volatility, $z_t \sim i.i.d.(0,1)$ and
$u_t \sim i.i.d.(0,\sigma_u^2)$ with $z_t$ and $u_t$ being mutually independent,
$h_t = var(r_t|\mathcal{F}_{t-1})$ with $\mathcal{F}_t = \sigma(r_t, x_t, r_{t-1}, x_{t-1}, \ldots)$,
and $\tau(z) = \tau_1 z_t + \tau_2 z_t^2$ is called the leverage function.

It is worth noting that while we use only this specific version of Realized GARCH, Hansen et al.
(2011) introduce a general family of models which generalizes GARCH models, as it can nest
any GARCH specification. Also, the assumption on innovations is not essential and can be replaced
by other common assumptions, such as Student's t.

Hansen et al. (2011) also provide the asymptotic properties of the quasi-maximum likelihood
estimator (QMLE henceforth) and propose to use it for the parameter estimation. The structure
of the QMLE is very similar to that of the standard GARCH model, but we also need to accommodate
realized measures in the estimation. The log-likelihood function is given by

$$\log L(\{r_t, x_t\}_{t=1}^{T}; \theta) = \sum_{t=1}^{T} \log f(r_t, x_t|\mathcal{F}_{t-1}). \qquad (21)$$

Standard GARCH models do not have the realized measure $x_t$, so we need to factorize the joint
conditional density

$$f(r_t, x_t|\mathcal{F}_{t-1}) = f(r_t|\mathcal{F}_{t-1})\,f(x_t|r_t, \mathcal{F}_{t-1}) \qquad (22)$$

and use the partial log-likelihood, $\ell(r) = \sum_{t=1}^{T} \log f(r_t|\mathcal{F}_{t-1})$, when
comparing the fits to a standard GARCH. For the Gaussian specification of $z_t$ and $u_t$, the
joint likelihood is then split into the sum

$$\ell(r, x) = \underbrace{-\frac{1}{2}\sum_{t=1}^{T}\left(\log(2\pi) + \log(h_t) + r_t^2/h_t\right)}_{\ell(r)} \; \underbrace{-\frac{1}{2}\sum_{t=1}^{T}\left(\log(2\pi) + \log(\sigma_u^2) + u_t^2/\sigma_u^2\right)}_{\ell(x|r)}, \qquad (23)$$

while other standard specifications can be used in a similar manner.
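
A minimal sketch of how the Gaussian log-likelihood of Eq. (23) could be evaluated for the log-linear Realized GARCH(1,1) of Eqs. (18)-(20) is given below. The function name, the parameter ordering and the initial variance `h0` are illustrative assumptions, not the estimation code used for the results.

```python
import math


def realized_garch_loglik(params, r, x, h0):
    """Return (total, partial, measurement) Gaussian log-likelihoods."""
    omega, beta, gamma, xi, psi, tau1, tau2, sigma_u2 = params
    log_h = math.log(h0)  # assumed starting value for the conditional variance
    ll_r = 0.0
    ll_x = 0.0
    for t in range(len(r)):
        if t > 0:
            # GARCH equation (19): update log variance from the lagged measure.
            log_h = omega + beta * log_h + gamma * math.log(x[t - 1])
        h = math.exp(log_h)
        z = r[t] / math.sqrt(h)
        # Measurement equation (20): residual of the realized measure.
        u = math.log(x[t]) - xi - psi * log_h - tau1 * z - tau2 * z * z
        ll_r += -0.5 * (math.log(2.0 * math.pi) + log_h + r[t] ** 2 / h)
        ll_x += -0.5 * (math.log(2.0 * math.pi) + math.log(sigma_u2)
                        + u * u / sigma_u2)
    return ll_r + ll_x, ll_r, ll_x


# Toy inputs (made-up numbers, not estimates from the paper).
toy_params = (0.1, 0.6, 0.3, -0.2, 1.0, -0.05, 0.1, 0.2)
total, partial, measurement = realized_garch_loglik(
    toy_params, [0.01, -0.02, 0.015], [1e-4, 2e-4, 1.5e-4], 1e-4)
```

The second returned value is the partial log-likelihood $\ell(r)$, which is the quantity comparable to the log-likelihood of a standard GARCH model.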

The Realized GARCH framework is rather general. For example, it can accommodate more
realized measures. In our analysis, we will estimate Realized GARCH(1,1) models using different
$x_t$, namely RV, BV, TSRV, RK and JWTSRV from the previous sections, and compare their performance.

4.2. Realized Wavelet Jump-GARCH(1,1)

By estimating different Realized GARCH models using various realized measures, we will see
which measure carries the best information for forecasting volatility. In addition, we would like
to utilize the estimated jumps as well as the decomposition of the JWTSRV and propose two more
specifications, the Realized Jump-GARCH(1,1) model (Realized J-GARCH) and the Realized Wavelet
Jump-GARCH(1,1) model (Realized WJ-GARCH). First, by adding the estimated jumps to the variance
equation, we obtain the Realized J-GARCH(1,1) model given by

$$r_t = \sqrt{h_t}\,z_t, \qquad (24)$$
$$\log(h_t) = \omega + \beta\log(h_{t-1}) + \gamma\log(x_{t-1}) + \gamma_J\log(1 + JV_{t-1}), \qquad (25)$$
$$\log(x_t) = \xi + \psi\log(h_t) + \tau_1 z_t + \tau_2 z_t^2 + u_t, \qquad (26)$$

where $x_t$ and $JV_t$ are estimated using Eq. (16) and Eq. (15) by our
$\widehat{RV}^{(JWTSRV)}_t$ and $\widehat{JV}^{(W)}_t$, respectively, and $z_t$ and $u_t$ are
Gaussian and mutually independent. This model is a logical step in generalizing the Realized GARCH
structure, as $\widehat{RV}^{(JWTSRV)}_t$ and $\widehat{JV}^{(W)}_t$ add up to the quadratic
variation of the underlying price process, which is not biased by noise. If jumps have a significant
impact on volatility forecasts, the $\gamma_J$ coefficient should be significantly different from
zero.

Finally, we utilize the wavelet decomposition of integrated volatility into different investment
horizons and estimate the model where $x_{j,t}$ will represent $\widehat{RV}^{(JWTSRV)}_{j,t}$ at
all estimated investment horizons $j = 1, \ldots, J^m+1$. The Realized Wavelet Jump-GARCH(1,1)
model is given by

$$r_t = \sqrt{h_t}\,z_t, \qquad (27)$$
$$\log(h_t) = \omega + \beta\log(h_{t-1}) + \sum_{j=1}^{J^m+1} \gamma_j\log(x_{j,t-1}) + \gamma_J\log(1 + JV_{t-1}), \qquad (28)$$
$$\log(x_t) = \xi + \psi\log(h_t) + \tau_1 z_t + \tau_2 z_t^2 + u_t, \qquad (29)$$

where $x_{j,t}$ is estimated using Eq. (16) by $\widehat{RV}^{(JWTSRV)}_{j,t}$, $JV_t$ is estimated
using Eq. (15) by $\widehat{JV}^{(W)}_t$, and $z_t$ and $u_t$ are Gaussian and mutually independent.
Note that the $j$ components always add up to the overall variance
$x_t = \widehat{RV}^{(JWTSRV)}_t$. Our last model is motivated by the decomposition of realized
volatility into several investment horizons. The $\gamma_j$ coefficients will provide a good
guide to the significance of the various investment horizons.
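
The variance recursion of Eq. (28) can be sketched as a single update step. The function name and the numerical inputs are hypothetical; the Realized J-GARCH recursion corresponds to the special case with one realized measure.

```python
import math


def wj_garch_log_variance(omega, beta, gammas, gamma_jump,
                          log_h_prev, x_prev, jv_prev):
    """One step of the log-variance recursion with horizon and jump terms."""
    horizon_term = sum(g * math.log(xj) for g, xj in zip(gammas, x_prev))
    jump_term = gamma_jump * math.log(1.0 + jv_prev)
    return omega + beta * log_h_prev + horizon_term + jump_term


# Toy step with two investment horizons and a small jump variation (made-up numbers).
next_log_h = wj_garch_log_variance(
    omega=0.1, beta=0.6, gammas=[0.2, 0.1], gamma_jump=0.5,
    log_h_prev=math.log(1e-4), x_prev=[6e-5, 4e-5], jv_prev=1e-5)
```

Using $\log(1 + JV_{t-1})$ keeps the jump term well defined on days without detected jumps, when the jump variation is zero.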

All the models are estimated by QMLE and can be easily generalized by assuming different
distributions of $z_t$ and $u_t$. We have also tried to incorporate different distributions$^2$, but
the results did not change qualitatively; to keep the number of estimated models under control, we
report the results for the Gaussian case only.

4.3. Forecast evaluation using different realized variance measures

To analyze the forecast efficiency and information content of different volatility estimators in
the Realized GARCH framework, we employ the popular approach of Mincer and Zarnowitz (1969)
regressions. The regression takes the form:

$$V^{(m)}_{t+1} = \alpha + \beta V^{RG-(k)}_{t} + \epsilon_t, \qquad (30)$$

with $V^{(m)}_{t+1}$ being the integrated volatility estimated using the square root of the $m$-th
estimator, namely, RV, BV, TSRV, RK and JWTSRV, respectively. $V^{RG-(k)}_{t}$ denotes the 1-day
ahead forecast of $V^{(m)}_{t+1}$ using the $k$-th estimator based on Realized GARCH(1,1), namely
RV, BV, TSRV, RK, JWTSRV and finally Realized J-GARCH(1,1) and Realized WJ-GARCH(1,1). We report
in-sample as well as rolling out-of-sample results.

2 Results for other cases are available from the authors upon request.

Table 1: The table summarizes the daily log-return distributions of GBP, CHF and EUR futures. The sample period
extends from January 5, 2007 through November 17, 2010, accounting for a total of 944 observations.

        Mean      St.dev.   Skew.      Kurt.
GBP     0.0001    0.0119    -0.3852    4.4356
CHF     0.0002    0.0068     0.2440    5.4662
EUR     0.0002    0.0099     0.1536    4.4951

After testing the forecasting efficiency of the different volatility models, we would also like to
test the information content of the wavelet decomposition of the realized volatility. For this
purpose, we separately estimate Realized J-GARCH(1,1) for all components $JWTSRV_j$ for
$j = 1, \ldots, 5$ of the realized volatility. Finally, we evaluate the forecasts using the
heteroskedasticity-adjusted mean square error (HMSE) of
Bollerslev and Ghysels (1996) and the QLIKE of Bollerslev et al. (1994).
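
The evaluation tools can be sketched as follows. The Mincer-Zarnowitz regression is plain OLS of the realized measure on the forecast. The HMSE and QLIKE formulas used here are the commonly used forms and are an assumption, since the text does not spell them out; all function names and toy numbers are hypothetical.

```python
import math


def mincer_zarnowitz(actual, forecast):
    """OLS of actual on forecast; returns (alpha, beta, R squared)."""
    n = len(actual)
    mean_f = sum(forecast) / n
    mean_a = sum(actual) / n
    sxx = sum((f - mean_f) ** 2 for f in forecast)
    sxy = sum((f - mean_f) * (a - mean_a) for f, a in zip(forecast, actual))
    beta = sxy / sxx
    alpha = mean_a - beta * mean_f
    ss_res = sum((a - alpha - beta * f) ** 2 for f, a in zip(forecast, actual))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return alpha, beta, 1.0 - ss_res / ss_tot


def hmse(actual_var, forecast_var):
    """Assumed form: mean of (actual / forecast - 1)^2."""
    return sum((a / f - 1.0) ** 2
               for a, f in zip(actual_var, forecast_var)) / len(actual_var)


def qlike(actual_var, forecast_var):
    """Assumed form: mean of log(forecast) + actual / forecast."""
    return sum(math.log(f) + a / f
               for a, f in zip(actual_var, forecast_var)) / len(actual_var)


# Toy check: the "actual" series is an exact linear function of the forecast.
alpha, beta, r2 = mincer_zarnowitz([3.0, 5.0, 7.0, 9.0], [1.0, 2.0, 3.0, 4.0])
```

An unbiased and efficient forecast would correspond to an intercept close to zero and a slope close to one, with a higher $R^2$ indicating higher information content.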

5. Does decomposition bring any improvement in volatility forecasting? 

5.1. Data description 

Foreign exchange futures contracts are traded on the Chicago Mercantile Exchange (CME) on a
24-hour basis. As these markets are among the most liquid, they are suitable for the analysis of
high-frequency data. We will estimate the realized volatility of British pound (GBP), Swiss franc
(CHF) and euro (EUR) futures. All contracts are quoted in the unit value of the foreign currency in
US dollars. It is advantageous to use currency futures data for the analysis instead of spot currency
prices, as they embed interest rate differentials and do not suffer from additional microstructure
noise coming from over-the-counter trading. The cleaned data are available from Tick Data, Inc.$^3$

It is very important to look first at the changes in the trading system before we proceed with 
the estimation on the data. In August 2003, for example, the CME launched the Globex trading 
platform, and for the first time ever in a single month, the trading volume on the electronic 
trading platform exceeded 1 million contracts every day. On Monday, December 18, 2006, the 
CME Globex(R) electronic trading platform started offering nearly continuous trading. More 
precisely, the trading cycle became 23 hours a day (from 5:00 pm on the previous day until 4:00 
pm on current day, with a one-hour break in continuous trading), from 5:00 pm on Sunday until 
4:00 pm on Friday. These changes certainly had a dramatic impact on trading activity and the 
amount of information available, resulting in difficulties in comparing the estimators on the pre- 
2003 data, the 2003-2006 data and the post-2006 data. For this reason, we restrict our analysis to 
a sample period extending from January 5, 2007 through November 17, 2010, which contains the 
most recent financial crisis. The futures contracts we use are automatically rolled over to provide 
continuous price records, so we do not have to deal with different maturities. 



3 http://www.tickdata.com/

Figure 1: Daily returns, estimated jump variation and $IV_t$ estimated by JWTSRV for (a) GBP, (b) CHF and (c)
EUR futures.



The tick-by-tick transactions are recorded in Chicago time, i.e., Central Standard Time (CST).
In a given day, trading activity therefore starts at 5:00 pm CST in Asia, continues in Europe,
followed by North America, and finally closes at 4:00 pm in Australia. To exclude potential
jumps due to the one-hour gap in trading, we redefine the day in accordance with the electronic
trading system. Moreover, we eliminate transactions executed on Saturdays and Sundays, US
federal holidays, December 24 to 26, and December 31 to January 2, because the low activity
on these days could lead to estimation bias. We are left with 944 days in the sample. At higher
frequencies, we find a large number of transactions sharing exactly the same time stamp; we
replace all observations with the same time stamp by their arithmetic average. Table 1 presents
the summary statistics for the daily log-returns of GBP, CHF and EUR futures over the sample
period, t = 1, . . . , 944, i.e., January 5, 2007 to November 17, 2010. The summary statistics
display an average return very close to zero, skewness, and excess kurtosis, which is consistent
with the large empirical literature.
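The treatment of trades that share a time stamp can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `average_duplicate_ticks` and the tick values are hypothetical, and the input is assumed to be a list of (time stamp, price) pairs.

```python
from collections import defaultdict
from datetime import datetime


def average_duplicate_ticks(ticks):
    """Collapse all trades sharing a time stamp into one arithmetic-mean price."""
    buckets = defaultdict(list)
    for stamp, price in ticks:
        buckets[stamp].append(price)
    # Return one (stamp, mean price) pair per distinct time stamp, in time order.
    return [(stamp, sum(prices) / len(prices)) for stamp, prices in sorted(buckets.items())]


# Hypothetical ticks; the prices are illustrative only.
raw_ticks = [
    (datetime(2007, 1, 5, 9, 0, 0), 1.9300),
    (datetime(2007, 1, 5, 9, 0, 0), 1.9302),  # same time stamp as the previous trade
    (datetime(2007, 1, 5, 9, 0, 1), 1.9301),
]
clean_ticks = average_duplicate_ticks(raw_ticks)
```

The calendar filters (weekends, holidays) and the redefinition of the trading day are left out of the sketch.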

Figure 2: Decomposed annualized volatility (by 252 days) of GBP, CHF and EUR futures using JWTSRV: (a)
volatility on an investment horizon of 10 minutes, (b) volatility on an investment horizon of 20 minutes, (c)
volatility on an investment horizon of 40 minutes, (d) volatility on an investment horizon of 80 minutes, (e)
volatility on an investment horizon of up to 1 day. Note that the sum of components (a), (b), (c), (d) and (e)
gives the total volatility.

Having prepared the data, we can estimate the integrated volatility using different estimators
and use them within the proposed forecasting framework. For each futures contract, the daily
integrated volatility is estimated using the square root of the realized variance estimator of
Andersen et al. (2003), the bipower variation estimator of Barndorff-Nielsen and Shephard (2004),
the two-scale realized volatility of Zhang et al. (2005), and the realized kernel of
Barndorff-Nielsen et al. (2008), all described in Section 2. Finally, we utilize our jump wavelet
two-scale realized variance estimator defined by Eq. (16). All the estimators are adjusted for
small sample bias. For convenience, we refer to the estimators in the description of the results
as RV, BV, TSRV, RK and JWTSRV, respectively. The RV and BV estimates are computed on 5-minute
log-returns. The TSRV and the JWTSRV are estimated using a slow time scale of 5 minutes.
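As a rough illustration of the two simplest measures, the sketch below computes realized variance and bipower variation from one day of 5-minute log-returns. It uses the textbook forms (sum of squared returns, and pi/2 times the sum of products of adjacent absolute returns) and omits the small-sample adjustment mentioned above; the function names and prices are hypothetical.

```python
import math


def log_returns(prices):
    """Log-returns between consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices[:-1], prices[1:])]


def realized_variance(returns):
    """Sum of squared intraday returns."""
    return sum(r * r for r in returns)


def bipower_variation(returns):
    """(pi/2) * sum |r_i| * |r_{i-1}|; pi/2 equals mu_1^{-2} with mu_1 = E|Z| for standard normal Z."""
    scale = math.pi / 2.0
    return scale * sum(abs(a) * abs(b) for a, b in zip(returns[1:], returns[:-1]))


# Hypothetical 5-minute prices for part of one trading day.
prices_5min = [1.9300, 1.9310, 1.9305, 1.9350, 1.9348, 1.9352]
r = log_returns(prices_5min)
rv = realized_variance(r)
bv = bipower_variation(r)
daily_vol_rv = math.sqrt(rv)  # the volatility measure is the square root of the variance
```

TSRV, RK and JWTSRV need subsampling, kernel weights and wavelet filtering respectively, so they are not covered here.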

The decomposition of volatility into the so-called continuous and jump parts is depicted in
Figure 1, which shows the returns, the estimated jumps and the integrated variances obtained with
the JWTSRV estimator for all three futures contracts. Figure 2 shows the further decomposition
into several investment horizons. For better illustration, we annualize the square root of the
integrated variance to obtain the annualized volatility, and we compute the components of the
volatility on several investment horizons. Figure 2 (a) to (e) show the investment horizons of
10 minutes, 20 minutes, 40 minutes, 80 minutes and up to 1 day, respectively. Interestingly, most
of the volatility (around 50%) comes from the fast, 10-minute investment horizon, which is a new
insight. The finding is intuitive, as it shows that volatility is created on fast scales of up to
10 minutes rather than on slower scales. The longer the horizon, the lower its contribution to the
total variation.
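The annualization and the horizon shares can be illustrated with a short sketch. The component values below are hypothetical and chosen only so that the 10-minute horizon carries half of the total; the sketch works on the variance scale, which is an assumption about how the shares are computed.

```python
import math


def annualized_volatility(daily_variance, days_per_year=252):
    """Annualize a daily variance and return it on the volatility (square-root) scale."""
    return math.sqrt(days_per_year * daily_variance)


def horizon_shares(components):
    """Share of each horizon's variance component in the total daily variance."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


# Hypothetical daily variance components for one day (not taken from the paper).
components = {
    "10 min": 2.0e-5,
    "20 min": 1.0e-5,
    "40 min": 0.5e-5,
    "80 min": 0.3e-5,
    "up to 1 day": 0.2e-5,
}
shares = horizon_shares(components)
total_annualized = annualized_volatility(sum(components.values()))
```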

5.2. Forecasting results 

We present the main results of estimation and forecasting here. The estimation strategy is as
follows. For each of the three forex futures considered, namely GBP, CHF and EUR, we first estimate
a benchmark GARCH(1,1) model. Then, we estimate the Realized GARCH of Hansen et al. (2011).
We compare the performance of this model across several realized volatility measures, namely RV,
BV, RK, TSRV and JWTSRV. Finally, we add our Realized Jump-GARCH model and Realized Wavelet
Jump-GARCH model. We use the period from January 5, 2007 to February 2, 2010 for estimation of
all the models and refer to it as the in-sample period. The rest of the year 2010 is saved for
comparison of the out-of-sample forecasts on a rolling basis. We use open-to-close returns as well
as open-to-close realized measures in the analysis.

Figure 3: Scatter plot of $h_t$ on $x_t$ mapped into probability integral transforms (PITs) for all models.
Rows contain estimates for GBP, CHF and EUR futures, while columns contain GARCH(1,1), Realized GARCH(1,1)
estimates using RV, BV, RK, TSRV and JWTSRV, denoted as RG-RV, RG-BV, RG-RK, RG-TSRV and RG-JWTSRV,
and finally Realized Jump-GARCH(1,1) and Realized Wavelet Jump-GARCH(1,1), denoted as RJ-G and RWJ-G,
respectively.
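To make the estimation strategy more concrete, the following sketch filters a return series through a log-linear Realized GARCH(1,1) and produces a one-step-ahead variance forecast. It assumes the commonly used specification of Hansen et al., log h_t = omega + beta log h_{t-1} + gamma log x_{t-1} with measurement equation log x_t = xi + phi log h_t + tau1 z_t + tau2 (z_t^2 - 1) + u_t; the exact equations of this paper are given in an earlier section and may differ in detail. All parameter values and data are hypothetical, and no likelihood maximization is performed.

```python
import math


def realized_garch_filter(returns, measures, params, h0):
    """Run the log-linear Realized GARCH(1,1) recursion for given parameters.

    Returns the paths of h_t, z_t, u_t and the one-step-ahead forecast of h.
    """
    log_h = math.log(h0)
    h_path, z_path, u_path = [], [], []
    for r, x in zip(returns, measures):
        h = math.exp(log_h)
        z = r / math.sqrt(h)  # standardized return
        # Measurement-equation residual, with leverage function tau(z).
        u = (math.log(x) - params["xi"] - params["phi"] * log_h
             - params["tau1"] * z - params["tau2"] * (z * z - 1.0))
        h_path.append(h)
        z_path.append(z)
        u_path.append(u)
        # GARCH equation: next period's log-variance uses today's realized measure.
        log_h = params["omega"] + params["beta"] * log_h + params["gamma"] * math.log(x)
    return h_path, z_path, u_path, math.exp(log_h)


# Hypothetical parameters and data, for illustration only.
params = {"omega": -0.5, "beta": 0.55, "gamma": 0.40,
          "xi": -0.2, "phi": 1.0, "tau1": -0.05, "tau2": 0.10}
returns = [0.004, -0.007, 0.002, 0.010]
measures = [3.0e-5, 5.0e-5, 2.5e-5, 8.0e-5]  # e.g. daily realized variances
h_path, z_path, u_path, h_forecast = realized_garch_filter(returns, measures, params, h0=4.0e-5)
```

In a rolling out-of-sample exercise the same recursion would be re-run as the estimation window moves forward, with `h_forecast` compared against the next day's realized measure.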

Tables 2, 3 and 4 contain all results for GBP futures, CHF futures and EUR futures, respectively.
By observing the partial log-likelihood $\ell(r)$, we can see immediately that all the Realized
GARCH models reported in the second to sixth columns bring significant improvement over the
GARCH(1,1) model reported in the first column (in testing the significance of the difference, we
restrict ourselves to a simple log-likelihood ratio test). When we focus on the comparison of
Realized GARCH(1,1) with different realized measures $x_t$, we observe further significant
differences. This points to the importance of using a proper realized measure. As the simplest
measure, RV, is contaminated with noise and jumps, we expect the worst performance from the model
which uses it as a realized measure. BV is robust to jumps, and RK and TSRV are robust to noise.
Finally, our JWTSRV estimator is robust to both jumps and noise in the realized variance, so we
expect the best performance from the model which uses JWTSRV. Looking at the results, all the
parameter estimates for the different realized measures are similar to each other, while the
log-likelihoods $\ell(r, x)$ uncover rather large differences between the models. For all three
currency futures used in this study, the Realized GARCH(1,1) model with the JWTSRV realized
measure performs significantly better than with RV, BV, RK or TSRV. Its log-likelihood shows the
largest improvement over all other models. Models with RV, BV and TSRV reach more or less similar
levels of the log-likelihood, while surprisingly the model with the RK measure of realized
variance is by far the worst in all cases.

Figure 3 compares the latent volatility $h_t$ and the measured volatility $x_t$ from all models.
It brings further insight into the various fits and confirms our findings. Compared to the RV, BV,
RK and TSRV measures, the relationship between $h_t$ and $x_t$ is strongest for our last three
models based on the JWTSRV measure. Moreover, the plot for RK explains why it performs so
badly in comparison to the other estimators. Figure 4 shows the scatter plots of the residuals
$z_t$ and $u_t$ and confirms a good specification of all models.

Figure 4: Scatter plot of $z_t$ on $u_t$ residuals mapped into probability integral transforms (PITs) obtained for
different models. Rows contain estimates for GBP, CHF and EUR futures, while columns contain Realized GARCH(1,1)
estimates using RV, BV, RK, TSRV and JWTSRV, denoted as RG-RV, RG-BV, RG-RK, RG-TSRV and RG-JWTSRV,
and finally Realized Jump-GARCH(1,1) and Realized Wavelet Jump-GARCH(1,1), denoted as RJ-G and RWJ-G,
respectively.

The fact that Realized GARCH(1,1) with the JWTSRV measure performs by far the best in all cases
and improves the log-likelihood supports our further modifications. Motivated by these results, we
study whether the inclusion of jumps improves the fit in our newly proposed Realized
Jump-GARCH(1,1) model, denoted as Realized J-GARCH in the Tables. The Realized J-GARCH model
brings further significant improvement in the log-likelihood in all cases. The $\gamma_J$
coefficient is significantly different from zero in the case of CHF and EUR, but cannot be
statistically distinguished from zero in the case of GBP. The only reason we can see is that for
GBP futures the estimated jump variation is the lowest among the currencies used, so it does not
play a significant role in forecasts. Still, we can conclude that jumps bring significant
improvement in the modeling and that Realized Jump-GARCH(1,1) using JWTSRV outperforms the other
models.
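The simple log-likelihood ratio test used for such comparisons can be sketched for the case of one additional parameter, for example a single jump coefficient added to a nested model. The log-likelihood values below are hypothetical, and the chi-square(1) reference distribution is the standard asymptotic assumption for one restriction.

```python
import math


def lr_test_one_restriction(loglik_restricted, loglik_unrestricted):
    """Likelihood ratio statistic and chi-square(1) p-value for one extra parameter."""
    stat = max(2.0 * (loglik_unrestricted - loglik_restricted), 0.0)
    # For one degree of freedom, P(chi2_1 > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods of a model without and with the extra coefficient.
lr_stat, lr_pvalue = lr_test_one_restriction(-1250.0, -1245.5)
reject_at_5_percent = lr_pvalue < 0.05
```

Comparisons that add several parameters at once (such as all wavelet components) would need the chi-square distribution with the matching degrees of freedom, which this one-parameter shortcut does not cover.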

As the last step, we utilize the realized variance decomposition of JWTSRV, as we expect that it
will further improve the forecasts. Our motivation is straightforward: we would like to find out
whether the different investment horizons bring improvement to the volatility forecasts. For
this, we utilize our newly proposed Realized Wavelet Jump-GARCH(1,1) model (Eq. 27), denoted
as Realized WJ-GARCH, where we include all estimated components of realized variation, i.e.,
all the different components of JWTSRV and jumps. Results for all three currencies are strikingly
conclusive for the Realized WJ-GARCH model. It brings further improvement in both the full
and partial log-likelihoods. Interestingly, the $\gamma_{W_1}$ and $\gamma_{W_3}$ coefficients are
significantly different from zero in all three cases, while $\gamma_{W_5}$ is significantly
different from zero in two cases and $\gamma_{W_2}$ in one case. The significance of the jump term
does not change. This leads us to the conclusion that the wavelet decomposition not only improves
overall performance, but can also be utilized to improve volatility modeling.

The remaining five models in Tables 2, 3 and 4 are Realized Jump-GARCH(1,1) models estimated on
the JWTSRV components separately. Specifically, we use JWTSRV$_j$ for $x_t$ to see the
contribution of the different components. These last models confirm the previous findings.
Interestingly, when only the first component (investment horizon of 10 minutes) is used, the model
performs very similarly to the best Realized WJ-GARCH model. This confirms the intuition that most
of the information is carried within this scale.

Until now, we have focused on in-sample results. Turning our attention to the out-of-sample
results, we can see that they confirm our findings from the in-sample estimation. The Realized
WJ-GARCH model substantially improves the out-of-sample forecasts in terms of $R^2$ from the
Mincer-Zarnowitz regression in comparison to all other models, and HMSE and QLIKE confirm this
result. We also note that the out-of-sample forecasts are very accurate, as $\beta$ from the
Mincer-Zarnowitz regressions is very close to 1. It is worth noting that this is not always the
case when other realized measures are used. Still, $\alpha$ shows some bias in the forecasts,
especially in the case of the GBP and CHF currencies. This bias can be attributed to the period we
chose for forecasting.
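The forecast evaluation can be sketched as follows. The Mincer-Zarnowitz regression is taken to be the OLS regression of the realized measure on a constant and the forecast; the HMSE and QLIKE forms below are common definitions from the forecasting literature and are an assumption here, since the section does not spell them out. The data are hypothetical.

```python
import math


def mincer_zarnowitz(realized, forecast):
    """OLS of realized = alpha + beta * forecast + error; returns (alpha, beta, R^2)."""
    n = len(realized)
    mean_f = sum(forecast) / n
    mean_y = sum(realized) / n
    sxx = sum((f - mean_f) ** 2 for f in forecast)
    syy = sum((y - mean_y) ** 2 for y in realized)
    sxy = sum((f - mean_f) * (y - mean_y) for f, y in zip(forecast, realized))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_f
    r_squared = (sxy * sxy) / (sxx * syy)
    return alpha, beta, r_squared


def hmse(realized, forecast):
    """Heteroskedasticity-adjusted MSE: mean of (realized / forecast - 1)^2."""
    return sum((y / f - 1.0) ** 2 for y, f in zip(realized, forecast)) / len(realized)


def qlike(realized, forecast):
    """QLIKE loss: mean of log(forecast) + realized / forecast."""
    return sum(math.log(f) + y / f for y, f in zip(realized, forecast)) / len(realized)


# Hypothetical realized measures and one-step-ahead forecasts.
realized = [4.1e-5, 3.6e-5, 5.2e-5, 6.0e-5, 4.4e-5]
forecast = [4.0e-5, 3.9e-5, 4.8e-5, 5.5e-5, 4.7e-5]
alpha, beta, r2 = mincer_zarnowitz(realized, forecast)
```

An unbiased forecast corresponds to alpha close to 0 and beta close to 1, which is the reading applied to the out-of-sample results above.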

6. Conclusion 

In this paper, we propose a forecasting model based on decomposed integrated volatilities and
jumps. The model combines a jump wavelet two-scale realized volatility estimator, which measures
volatility in the time-frequency domain, with the recently proposed Realized GARCH model. While
the JWTSRV estimator is able to consistently estimate jumps from the price process, it can also be
used to decompose the volatility into several investment horizons. This motivates us to propose
new Realized GARCH models that include jumps as well as volatility decomposed into an arbitrarily
chosen number of investment horizons.

After introducing the wavelet-based estimation of quadratic variation and all the estimators
used in the study, we build the new Realized Jump-GARCH and Realized Wavelet Jump-GARCH
models for volatility forecasting. We compare our estimator to several of the most popular
estimators, namely realized variance, bipower variation, two-scale realized volatility and
realized kernels, in a forecasting exercise using the Realized GARCH framework. The wavelet-based
estimator brings significant improvement in the volatility forecasts. A model incorporating jumps
improves forecasting ability even more.

To conclude the empirical findings, we show that our wavelet-based estimators bring a significant
improvement to volatility estimation and forecasting. They also offer a new method of
time-frequency modeling of realized volatility, which helps us to better understand the dynamics
of stock market behavior. Specifically, our theory uncovers that most of the volatility is created
on higher frequencies.

Appendix A. The maximal overlap discrete wavelet transform



The maximal overlap discrete wavelet transformation (MODWT) is a translation-invariant 
type of discrete wavelet transformation, i.e., it is not sensitive to the choice of starting point of the 
examined process. Furthermore, the MODWT does not use a downsampling procedure as in the 
case of the discrete wavelet transform 4 (DWT), so the wavelet and scaling coefficient vectors at all 
levels (scales) have equal length. As a consequence, the MODWT is not restricted to sample sizes 
that are powers of two. This feature is very important for the analysis of real market data, since 
this limitation is usually too restrictive. In the literature the MODWT is also called the stationary 
wavelet transform, the translation invariant transform and the undecimated wavelet transform. 
For more details about the MODWT see Mallat (1998), Percival and Walden (2000) and Gençay
et al. (2002).

The MODWT is a very convenient tool for variance and energy analysis of a time series in the 
time-frequency domain. Percival (1995) demonstrates the advantages of the MODWT estimator 
of variance over the DWT estimator, Serroukh et al. (2000) analyze the statistical properties of 
the MODWT variance estimator for non-stationary and non-Gaussian processes. 

Appendix A.1. Definition of MODWT filters

First, let us introduce the MODWT scaling and wavelet filters $g_l$ and $h_l$, $l = 0, 1, \ldots, L-1$,
where $L$ denotes the length of the wavelet filter. For example, the Daubechies D(4) wavelet filter
has length $L = 4$ (Daubechies, 1992). Generally, the scaling filter is a low-pass filter whereas the
wavelet filter is a high-pass filter. There are three basic properties that both the MODWT filters
must fulfill. Let us show these properties for the MODWT wavelet filter:

$$\sum_{l=0}^{L-1} h_l = 0, \qquad \sum_{l=0}^{L-1} h_l^2 = 1/2, \qquad \sum_{l=-\infty}^{\infty} h_l h_{l+2N} = 0, \quad N \in \mathbb{Z}_N, \tag{A.1}$$

and for the MODWT scaling filter:

$$\sum_{l=0}^{L-1} g_l = 1, \qquad \sum_{l=0}^{L-1} g_l^2 = 1/2, \qquad \sum_{l=-\infty}^{\infty} g_l g_{l+2N} = 0, \quad N \in \mathbb{Z}_N. \tag{A.2}$$

The transfer function of a MODWT filter $\{h_l\}$ at frequency $f$ is defined via the Fourier transform
as:

$$H(f) = \sum_{l=-\infty}^{\infty} h_l e^{-i 2\pi f l} = \sum_{l=0}^{L-1} h_l e^{-i 2\pi f l}, \tag{A.3}$$

with the squared gain function defined as: $\mathcal{H}(f) = |H(f)|^2$.
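The filter properties in Eqs. (A.1) and (A.2) can be checked numerically for the D(4) filter mentioned above. The sketch assumes the standard Daubechies D(4) scaling coefficients rescaled by $1/\sqrt{2}$ for the MODWT, and the quadrature mirror relation $h_l = (-1)^l g_{L-1-l}$ to obtain the wavelet filter; the helper names are hypothetical.

```python
import cmath
import math

# MODWT D(4) scaling filter: DWT coefficients divided by sqrt(2).
s3 = math.sqrt(3.0)
g = [(1 + s3) / 8, (3 + s3) / 8, (3 - s3) / 8, (1 - s3) / 8]
L = len(g)
# Wavelet filter via the quadrature mirror relation h_l = (-1)^l * g_{L-1-l}.
h = [(-1) ** l * g[L - 1 - l] for l in range(L)]


def even_shift_product(filt, n):
    """Sum over l of filt[l] * filt[l + 2n], treating the filter as zero outside 0..L-1."""
    shift = 2 * n
    return sum(filt[l] * filt[l + shift]
               for l in range(len(filt)) if 0 <= l + shift < len(filt))


def transfer(filt, f):
    """Transfer function of Eq. (A.3) at frequency f."""
    return sum(c * cmath.exp(-2j * math.pi * f * l) for l, c in enumerate(filt))


def squared_gain(filt, f):
    """Squared gain |H(f)|^2."""
    return abs(transfer(filt, f)) ** 2


wavelet_checks = (sum(h), sum(c * c for c in h), even_shift_product(h, 1))  # approx. (0, 1/2, 0)
scaling_checks = (sum(g), sum(c * c for c in g), even_shift_product(g, 1))  # approx. (1, 1/2, 0)
```

At frequency zero the squared gain of the wavelet filter vanishes and that of the scaling filter equals one, in line with their high-pass and low-pass roles.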



Appendix A.2. Pyramid algorithm 

We get the MODWT wavelet and scaling coefficients using the pyramid algorithm. The wavelet
coefficients at the first scale ($j = 1$) are obtained via filtering $x_i$, $i = 1, \ldots, N$, with the
MODWT wavelet and scaling filters (Percival and Walden, 2000):

$$W_{1,k} = \sum_{l=0}^{L-1} h_l x_{k-l \bmod N}, \qquad V_{1,k} = \sum_{l=0}^{L-1} g_l x_{k-l \bmod N}. \tag{A.4}$$

4 For a definition and detailed discussion of the discrete wavelet transform see Mallat (1998), Percival and Walden
(2000) and Gençay et al. (2002).



For the second stage of the algorithm, we replace $x_i$ with the scaling coefficients $V_{1,k}$ and
after the filtering we get the wavelet coefficients at scale $j = 2$ as:

$$W_{2,k} = \sum_{l=0}^{L-1} h_l V_{1,k-l \bmod N}, \qquad V_{2,k} = \sum_{l=0}^{L-1} g_l V_{1,k-l \bmod N}. \tag{A.5}$$

After two stages of the pyramid algorithm we have two vectors of the MODWT wavelet coefficients
$\mathbf{W}_1$, $\mathbf{W}_2$ and one vector of the MODWT scaling coefficients at scale two,
$\mathbf{V}_2$. Vector $\mathbf{W}_1$ represents wavelet coefficients at the frequency band
$f \in [1/4, 1/2]$, $\mathbf{W}_2$: $f \in [1/8, 1/4]$ and $\mathbf{V}_2$: $f \in [0, 1/8]$. The
$j$-th level MODWT coefficients are in the form:

$$W_{j,k} = \sum_{l=0}^{L-1} h_l V_{j-1,k-l \bmod N}, \qquad V_{j,k} = \sum_{l=0}^{L-1} g_l V_{j-1,k-l \bmod N}, \qquad j = 1, 2, \ldots, J^m, \tag{A.6}$$

where $V_{0,k} = x_k$ and $J^m < \log_2(N)$. Generally, the $j$-th level wavelet coefficients in the
vector $\mathbf{W}_j$ represent the frequency band $f \in [1/2^{j+1}, 1/2^{j}]$, whereas the $j$-th
level scaling coefficients in the vector $\mathbf{V}_j$ represent $f \in [0, 1/2^{j+1}]$. For
estimation of the wavelet covariance we use the MODWT wavelet coefficients unaffected by the
boundary conditions. For simplicity of notation, let us define a vector $\mathcal{W}$ that consists
of $J^m + 1$ $N$-dimensional subvectors, where the first $J^m$ subvectors are the MODWT wavelet
coefficients at levels $j = 1, \ldots, J^m$ and the last subvector consists of the MODWT scaling
coefficients at level $J^m$:

$$\mathcal{W} = [\mathbf{W}_1, \mathbf{W}_2, \ldots, \mathbf{W}_{J^m}, \mathbf{V}_{J^m}]^T. \tag{A.7}$$
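A minimal sketch of the pyramid algorithm with circular filtering is given below. One point is an assumption relative to the compact notation above: following the standard algorithm in Percival and Walden (2000), the sketch spaces the filter taps by $2^{j-1}$ at level $j$ (lag $2^{j-1} l$ rather than $l$), which is what yields the frequency bands quoted in the text. The function name `modwt` and the Haar example are hypothetical choices for illustration, and indices are zero-based.

```python
def modwt(x, h, g, levels):
    """Pyramid algorithm: return ([W_1, ..., W_J], V_J) for MODWT filters h (wavelet) and g (scaling)."""
    n = len(x)
    v = list(x)  # V_0 is the input series itself
    wavelet_coeffs = []
    for j in range(1, levels + 1):
        step = 2 ** (j - 1)  # spacing between filter taps at level j (assumption, see lead-in)
        w_next = [sum(h[l] * v[(k - step * l) % n] for l in range(len(h))) for k in range(n)]
        v_next = [sum(g[l] * v[(k - step * l) % n] for l in range(len(g))) for k in range(n)]
        wavelet_coeffs.append(w_next)
        v = v_next  # the scaling coefficients feed the next stage
    return wavelet_coeffs, v


# Haar MODWT filters and a short hypothetical series whose length is not a power of two.
haar_h = [0.5, -0.5]
haar_g = [0.5, 0.5]
series = [0.2, -0.1, 0.4, 0.0, -0.3, 0.1, 0.2]
W, V = modwt(series, haar_h, haar_g, levels=2)
```

Every coefficient vector has the same length as the input, which reflects the absence of downsampling noted at the start of this appendix.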

Appendix A.3. Wavelet decomposition of a stochastic process

For our analysis, it is important to show that we are able to decompose the variance (energy)
of a stochastic process on a scale-by-scale basis, i.e., we can get the variance contribution of
every level $j$, with the maximum level of decomposition $J^m < \log_2 N$. The (total) variance of
the time series $x_i$, $i = 1, \ldots, N$, can be decomposed on a scale-by-scale basis so that

$$\|\mathbf{x}\|^2 = \sum_{j=1}^{J^m} \|\mathbf{W}_j\|^2 + \|\mathbf{V}_{J^m}\|^2 = \sum_{j=1}^{J^m+1} \|\mathcal{W}_j\|^2, \tag{A.8}$$

where $\|\mathbf{x}\|^2 = \sum_{i=1}^{N} x_i^2$, $\|\mathbf{W}_j\|^2 = \sum_{k=1}^{N} W_{j,k}^2$,
$\|\mathbf{V}_{J^m}\|^2 = \sum_{k=1}^{N} V_{J^m,k}^2$, $\mathcal{W}_j$ denotes the $j$-th subvector
of the stacked vector in Eq. (A.7), and $\mathbf{W}_j$ and $\mathbf{V}_j$ are $N$-dimensional
vectors of the $j$-th level MODWT wavelet and scaling coefficients. The proof of the variance
decomposition, Eq. (A.8), using the MODWT can be found in Percival and Mofjeld (1997) and
Percival and Walden (2000).
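The energy decomposition in Eq. (A.8) can be verified numerically. The sketch below uses the Haar filter in closed form, where each stage takes half-differences and half-sums of scaling coefficients $2^{j-1}$ observations apart with circular wrapping (the lag spacing follows the standard algorithm and is an assumption relative to the compact notation in Eq. (A.6)). The return series is hypothetical.

```python
def haar_modwt_energies(x, levels):
    """Return [||W_1||^2, ..., ||W_J||^2, ||V_J||^2] for a Haar MODWT with circular boundary."""
    n = len(x)
    v = list(x)
    energies = []
    for j in range(1, levels + 1):
        lag = 2 ** (j - 1)
        w = [(v[k] - v[(k - lag) % n]) / 2.0 for k in range(n)]  # wavelet coefficients
        v = [(v[k] + v[(k - lag) % n]) / 2.0 for k in range(n)]  # scaling coefficients
        energies.append(sum(c * c for c in w))
    energies.append(sum(c * c for c in v))
    return energies


# Hypothetical intraday returns; their squared norm is a realized-variance-type quantity.
intraday_returns = [0.0012, -0.0007, 0.0003, 0.0021, -0.0015, 0.0004, -0.0002]
energies = haar_modwt_energies(intraday_returns, levels=2)
total_energy = sum(r * r for r in intraday_returns)
energy_gap = abs(sum(energies) - total_energy)  # zero up to floating-point rounding
```

Each stage preserves energy because ((a - b)/2)^2 + ((a + b)/2)^2 = (a^2 + b^2)/2, and summing over the circularly shifted pairs recovers the squared norm of the previous level.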

It is worth noting that the squared norm $\|\cdot\|^2$ is similar to the realized measure discussed
in the preceding sections. For example, in the case of the realized variance estimator (RV) the
energy decomposition can reveal the contributions of particular scales to the overall energy, hence
we can see what form this realized measure takes.
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